ESS330 Daily Assignment 21

Lecture 21: Introduction to Time Series Data in ‘R’

Author

Neva Morgan

Published

April 21, 2025

Objective:

In this activity, you will download streamflow data from the Cache la Poudre River (USGS site 06752260) and analyze it using a few time series methods.

Setting Up:

library(zoo)
library(timeSeries)
# Note: a separate "ts" package could not be installed. ts objects come from the built-in stats package, so no extra install is needed.
library(xts)
library(tidyverse)
# Once tidyverse is attached, dplyr::filter() and dplyr::lag() mask the stats and timeSeries versions.
library(lubridate)
library(tidymodels)
library(ggplot2)
library(tsibble)
library(feasts)
library(dplyr)

First, use this code to download the data from the USGS site.

library(dataRetrieval)
# Example: Cache la Poudre River at Mouth (USGS site 06752260)
poudre_flow <- readNWISdv(siteNumber = "06752260",   # Download data from USGS for site 06752260
                          parameterCd = "00060",     # Parameter code 00060 = discharge in cfs
                          startDate = "2013-01-01",  # Set the start date
                          endDate = "2023-12-31") |> # Set the end date
  renameNWISColumns() |> # Rename columns to standard names (e.g., "Flow","Date")
  mutate(Date = yearmonth(Date)) |> # Convert daily Date values into a year-month format (e.g., "2023 Jan")
  group_by(Date) |> # Group the data by the new monthly Date
  summarise(Flow = mean(Flow)) # Calculate the average daily flow for each month
GET:https://waterservices.usgs.gov/nwis/dv/?site=06752260&format=waterml%2C1.1&ParameterCd=00060&StatCd=00003&startDT=2013-01-01&endDT=2023-12-31
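
To see what the monthly aggregation step does without calling the USGS service, here is a minimal sketch on a small synthetic daily record. The `daily` data frame and its values are made up for illustration and are not Poudre data; the `na.rm = TRUE` argument is an added assumption so that a missing day would not turn a whole month into NA.

```r
library(dplyr)
library(tsibble)

# Hypothetical daily record (synthetic values, NOT USGS data):
# a constant flow of 10, 20, and 30 cfs in Jan, Feb, and Mar 2020
daily <- data.frame(
  Date = seq(as.Date("2020-01-01"), as.Date("2020-03-31"), by = "day"),
  Flow = c(rep(10, 31), rep(20, 29), rep(30, 31))
)

monthly <- daily |>
  mutate(Date = yearmonth(Date)) |>           # collapse each day to its year-month
  group_by(Date) |>                           # one group per month
  summarise(Flow = mean(Flow, na.rm = TRUE))  # mean daily flow within the month

# Naming the index explicitly avoids relying on automatic detection
monthly_ts <- as_tsibble(monthly, index = Date)
monthly_ts
```

Each month ends up as one row, which is the shape that `as_tsibble()` needs for a monthly series.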

Assignment:

1. Convert to tsibble

Use as_tsibble() to convert the data.frame into a tsibble object. This will allow you to use the feasts functions for time series analysis.

pf_tbl <- as_tsibble(poudre_flow)
Using `Date` as index variable.
head(pf_tbl)
# A tsibble: 6 x 2 [1M]
      Date   Flow
     <mth>  <dbl>
1 2013 Jan  18.1 
2 2013 Feb  18.0 
3 2013 Mar   8.21
4 2013 Apr   5.94
5 2013 May 333.  
6 2013 Jun 300.  

2. Plotting the time series

Use ggplot to plot the time series data, then make the plot interactive with plotly.

# Setting up for plotting

library(plotly)
pf_plot <- pf_tbl |>
  autoplot(Flow) + # Name the variable explicitly; autoplot() already draws the line
  labs(title = "Interactive Poudre Flow Time Series",
       x = "Date",
       y = "Flow (cfs)",
       subtitle = "ESS330 A-21 | Neva Morgan")
ggplotly(pf_plot)

3. Subseries

Describe what you see in the plot. How are “seasons” defined in this plot? What do you think the “subseries” represent?

In the gg_subseries() plot, monthly flow on the Poudre River is highest in May and June, with occasional increases in April or September.

Seasons in this plot are the calendar months: each panel collects the values for one month across all years. Months with similar flow stand out as groups, and the large increase in flow could represent the end of spring moving into the summer months (April to September).

From what we learned in class, each “subseries” is the sequence of values for a single month across the years of record, showing how flow in that month has changed from year to year.
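
For reference, a subseries plot of this kind can be sketched as follows. To keep the example runnable on its own, it uses `demo_tbl`, a synthetic monthly series with a late-spring peak that stands in for the real data; the object name and all values are assumptions for illustration only. In the assignment, `pf_tbl` would take its place.

```r
library(tsibble)
library(feasts)
library(ggplot2)

# Synthetic stand-in for monthly flow (NOT the Poudre data):
# 11 years of monthly values with a peak around May/June plus random noise
set.seed(330)
demo_tbl <- tsibble(
  Date = yearmonth(seq(as.Date("2013-01-01"), by = "1 month", length.out = 132)),
  Flow = 50 + 250 * exp(-((rep(1:12, 11) - 5.5)^2) / 2) + runif(132, 0, 20),
  index = Date
)

# One panel per calendar month; each panel traces that month across the years
sub_plot <- gg_subseries(demo_tbl, Flow) +
  labs(title = "Monthly subseries (synthetic data)",
       x = "Year",
       y = "Flow (cfs)")
sub_plot
```

With `pf_tbl`, the call would be `gg_subseries(pf_tbl, Flow)`.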

4. Decompose

Use the model(STL(…)) pattern to decompose the time series data into its components: trend, seasonality, and residuals. Choose a window that you feel is most appropriate to this data…

Describe what you see in the plot. How do the components change over time? What do you think the trend and seasonal components represent?
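
A minimal sketch of the model(STL(…)) pattern is shown below. It runs on `demo_tbl`, a synthetic monthly series used only so the example is self-contained; it is not the Poudre record. The window settings are assumptions chosen for illustration (a trend window of 25 months and a fixed, "periodic" seasonal pattern), not a recommendation for this data.

```r
library(tsibble)
library(feasts)
library(fabletools)
library(ggplot2)

# Synthetic stand-in for monthly flow (NOT the Poudre data)
set.seed(330)
demo_tbl <- tsibble(
  Date = yearmonth(seq(as.Date("2013-01-01"), by = "1 month", length.out = 132)),
  Flow = 50 + 250 * exp(-((rep(1:12, 11) - 5.5)^2) / 2) + runif(132, 0, 20),
  index = Date
)

# Fit an STL decomposition and extract its components.
# trend(window = 25): assumed smoothing span in months (odd number)
# season(window = "periodic"): assumes the seasonal shape is identical every year
# robust = TRUE: reduces the influence of unusual months on trend and season
decomp <- demo_tbl |>
  model(stl = STL(Flow ~ trend(window = 25) + season(window = "periodic"),
                  robust = TRUE)) |>
  components()

# Panels: original series, trend, seasonal component, remainder
autoplot(decomp)
```

A smaller trend window lets the trend follow shorter-term changes, while a numeric seasonal window (for example `season(window = 13)`) allows the seasonal pattern to change gradually from year to year.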

Submission:

Upload a rendered qmd file to the course website.

This should be an HTML file with self-contained: true.

It should not point to a local host; it must be the actual file.

Make sure to include your code and any plots you created and that the outputs render as you expect.